A latent class analysis approach to the identification of doctoral students at risk of attrition

To advance understanding of doctoral student experiences and the high attrition rates among Science, Technology, Engineering, and Mathematics (STEM) doctoral students, we developed and examined the psychological profiles of different types of doctoral students. We used latent class analysis on self-reported psychological data relevant to psychological threat from 1,081 incoming doctoral students across three universities and found that the best-fitting model delineated four threat classes: Lowest Threat, Nonchalant, Engaged/Worried, and Highest Threat. These classes were associated with characteristics measured at the beginning of students’ first semester of graduate school that may influence attrition risk, including differences in academic preparation (e.g., amount of research experience), self-evaluations and perceived fit (e.g., sense of belonging), attitudes towards graduate school and academia (e.g., strength of motivation), and interpersonal relations (e.g., perceived social support). Lowest Threat students tended to report the most positive characteristics and Highest Threat students the most negative characteristics, whereas the results for Nonchalant and Engaged/Worried students were more mixed. Ultimately, we suggest that Engaged/Worried and Highest Threat students are at relatively high risk of attrition. Moreover, the demographic distributions of profiles differed, with members of groups more likely to face social identity threat (e.g., women) being overrepresented in a higher threat profile (i.e., Engaged/Worried students) and underrepresented in lower threat profiles (i.e., Lowest Threat and Nonchalant students). We conclude that doctoral students meaningfully vary in their psychological threat at the beginning of graduate study and suggest that these differences may portend divergent outcomes.


Introduction
Nearly half the doctoral students in Science, Technology, Engineering, and Mathematics (STEM) do not graduate [1]. Although students leave for a variety of reasons (e.g., financial members of underrepresented racial/ethnic minority (URM) groups in STEM fields. Negative stereotypes about the intelligence of women and URM members are well-known, and, along with other environmental characteristics, can generate the conditions where SIT becomes salient and contributes to achievement gaps over time [20][21][22][23][24][25]. We suggest that the disproportionate attrition rates for these groups in doctoral education may be partially explained by SIT. SIT is an additional form of psychological threat on top of the psychological threats other students may face. For example, it is likely normative among Ph.D. students to sometimes worry about being seen as incompetent (general threat). However, students who worry that others are negatively evaluating them due to their race or gender (SIT) carry an additional psychological burden that may help explain their greater attrition. In addition to negative stereotypes, one cue that amplifies SIT is perceiving oneself to be a numerical minority, relevant in doctoral education given the underrepresentation of certain groups in many STEM fields. SIT is also heightened when people are highly identified with a domain and want to succeed [26], both highly likely for Ph.D. students.
Doctoral students may thus contend with an array of psychological threats, both at the beginning of graduate school and throughout their studies. Importantly, some students may face general psychological threats and SIT, while others face one or none. If psychological threat in general, and SIT in particular, are risk factors in doctoral education, it would help to identify students who, based on their threat pattern at the start of their studies, may benefit from early intervention. To work toward this goal, we use latent class analysis to identify student profiles of psychological threat at the start of doctoral education.

Latent class analysis
As noted, many variables related to psychological threat and SIT help explain achievement gaps at pre-doctoral levels of education. One approach to identifying Ph.D. students at risk of attrition is to regress attrition or other relevant outcomes on many or all of these potentially influential variables, plus their products, to account for specific combinations that may best predict educational outcomes. However, this kind of variable-centered approach requires many predictors, making models unwieldy and introducing multicollinearity. Indeed, variable-centered approaches often cannot examine complex higher-order interactions due to issues like statistical power [27] and thus obscure meaningful patterns among individuals. Mixture models, like latent class analysis (LCA), offer an alternative.
Mixture models are person-centered and reveal subgroups of people who share similar responses on a set of observed variables [28]. LCA is a type of mixture model used when one suspects that an unobserved categorical variable separates a population into mutually exclusive and exhaustive subgroups, or latent classes [29]. LCA can be thought of as a data reduction tool, as it distills a great deal of information into identifiable patterns reflected in the sample. A regression analysis analogue to an LCA with eight 3-level categorical indicators (i.e., observed variables used to differentiate classes) would require 3 8 = 6,561 possible subgroups to examine every possible pattern of responses. With LCA, we can reduce these subgroups to a few meaningful ones. LCA can parsimoniously show how academic risk factors interact, including how prevalent different risks (and subgroups) are.
Moreover, we can explore how identified subgroups differ on constructs aside from the indicators by examining associations with proximal outcomes, or characteristics. For instance, we can examine if subgroups characterized by ostensibly more threatening patterns of traits feel more negatively about graduate school than lower threat subgroups. By using LCA in this way, we take a critical first step towards long-term goals of determining where interventions should be applied by identifying where students fall along a threat continuum [30].

Overview of current research
In the present research, we use data from the Study for the Advancement of Graduate Education and Scholarship (SAGES) to better understand the psychological experiences of Ph.D. students at the start of graduate school. SAGES is a prospective multisite study of the psychological experiences of Ph.D. students that predict attrition and retention, particularly in STEM fields. We conducted an unrestricted LCA to delineate threat profiles (AKA classes) using data from a baseline survey completed by incoming Ph.D. students. Then, we examined how the selected model mapped onto student demographics, expecting differences consistent with past literature. For instance, female Ph.D. students in many STEM fields may experience identity threat and thus, be overrepresented in high SIT profiles. To better understand the profiles, we also examined differences between them on proximal indicators of threat important for academic success. Our goal was to identify profiles of incoming Ph.D. students with an eye towards understanding how these may have different risks for negative outcomes like attrition.

Method
The overall design and hypotheses of SAGES were pre-registered (https://bit.ly/3hNrjPL) including the hypothesis that women, first-generation, and URM students will have higher levels of psychological threat at the start of graduate school and a higher risk of attrition. We also outlined the use of LCA to create a composite threat variable based on past literature. Materials, data, and code are available at https://bit.ly/3hNrjPL.

Participants
Participants were two cohorts of first-semester Ph.D. students at three universities (Penn State, Columbia, and Stanford) who completed a baseline survey in Fall 2018 or 2019. All first-year STEM Ph.D. students were eligible to participate, and a smaller number of non-STEM students were targeted at Penn State. Results are based on the 1,081 students who completed the baseline survey (see Table 1). The sample of 1,081 includes three students with missing responses on one or more indicators because LCA uses maximum likelihood estimation, which allows partial data on indicators. However, this sample excludes 44 students who did not complete the baseline survey (i.e., who stopped responding before the end) but still provided data on LCA indicators. We conducted the final unrestricted 4-class LCA including these students (total N = 1,125) and found the same latent classes and a similar percentage of people in each class (see S1 Appendix).
Participants' age ranged from 19 to 55 (M = 24.17) and their gender was nearly evenly female (522) and male (548), with 11 identifying as genderqueer. Where data were available, we found that this sample was reasonably representative of the population, with the most notable difference being that women were overrepresented in our Cohort 1 Stanford sample (see S2 Appendix for details).

Procedures
Written Institutional Review Board (IRB) approval for SAGES was obtained at all sites (STUDY00007231, IRB-AAAR3748, and 28910 for Penn State, Columbia, and Stanford, respectively). We primarily recruited students by email but also attended in-person orientation sessions for new Ph.D. students. Recruitment details and timing varied slightly across campuses (see S3 Appendix), but generally started two weeks before the school year's start and concluded two weeks after classes began. We described the study as about understanding Ph.D. student experiences. Recruitment materials included a link to a baseline survey to assess students' thoughts, behaviors, and characteristics before starting graduate school. Participants who completed the 45-minute baseline survey were paid $15 and invited to participate in the longitudinal part of SAGES (not reported here). Students could participate in the baseline survey and not the longitudinal study. The recruitment procedure was similar across campuses. Most students were recruited with help from university administrators, who sent a recruitment email written by the research team to all incoming Ph.D. students in identified STEM fields (see https://bit.ly/3hNrjPL for a list), and a small number of non-STEM fields at Penn State. The email included an introduction from each dean that noted support for the study but assured students that their participation would not be known by the university or affect their graduate career. At Stanford, we did not have a way to email incoming STEM Ph.D. students in one college, so we sent the recruitment email (without a dean's message) to departmental administrators in that college with a request to forward the message to incoming Ph.D. students. We also attended in-person orientations for Ph.D. students at Stanford and Penn State where we handed out fliers. Students outside of the targeted fields could participate if they became aware of the study in this way. We did not have information about non-participating students, but the Graduate School at Penn State provided limited demographic data for all incoming students by field.

Measures
To develop a comprehensive set of measures, we spent several months identifying constructs relevant to psychological threat and persistence in academic settings and conducted an exhaustive literature review to identify the most relevant and psychometrically sound scales. We often selected measures used in research about lower levels of education given scarcity of research on doctoral education. We consulted an advisory board convened for this project and other experts in psychology, education, and STEM graduate education and disparities. We pilot tested measures with Ph.D. students to ensure clarity and face validity. To capture a comprehensive set of constructs relevant to psychological threat and attrition and in line with other longitudinal studies [31], we shortened scales where possible, basing decisions on psychometric properties (e.g., dropping items with lowest factor loadings) and face validity. We also changed response scales in some cases to be consistent with other measures.
Below, we list measures used as indicators for the LCA and as proximal outcomes to further characterize and validate the observed classes. We start with demographics and then describe measures by their placement in one of six superordinate categories we created to promote organization: academic preparation and context, self-evaluations and perceived fit, academic identity and graduate school attitudes, interpersonal relations, mental health, and SIT. As described below, we selected measures based on their association with psychological threat and/or attrition risk. We examined threat-relevant measures not used as indicators as proximal characteristics to enrich understanding of each class's psychological state (see Analytic Strategy). Unless otherwise noted, we averaged scale items; higher scores indicate more of the construct. See Table 2 for descriptive statistics. Table 1 provides demographic characteristics. Participants reported gender identity by selecting one or more of female, male, trans female, trans male, genderqueer or non-conforming, or entering a response [32]. For sexual orientation, participants selected among heterosexual or straight, gay or lesbian, asexual, bisexual, or entered a response. For race/ethnicity, participants self-reported in an open-text response and, on another page, selected any of several categories that applied. For first-generation status, participants reported whether they considered themselves a first-generation college student. For socioeconomic status (SES), using one of the Mac-Arthur Scales of Subjective Social Status, participants selected a rung on a ladder from 1 to 10 to represent where they currently stood relative to others in the United States [33]. We asked students their country of birth, citizenship status, and for U.S. citizens or permanent residents not born in the U.S., the age at which they entered. We used this information to categorize participants as international or domestic (see S4 Appendix for details).

Demographic characteristics
Academic preparation and context. For academic preparation, we examined students' previous research and educational experiences, which have been shown to be associated with persistence in doctoral education [34], and how well students felt prepared for graduate school, as high preparedness, particularly among women and minority Ph.D. students, has been linked to higher rates of publishing [18]. For context, we examined the percentage of women in each field, which can speak to the potential for gender-based SIT, and English proficiency, which can speak to the potential for threat based on a language barrier. Percentage women in field. We approximated the percentage of women in each field using the NSF Survey of Earned Doctorates [35]. See S5 Appendix for details. This contextualizing information sheds light on whether women are underrepresented in a given field, a contributor to gender-based SIT.
Years undergrad/postgrad research. Students reported the number of years of research experience they had during and after college.
Undergrad/postgrad research prep. Author-generated items assessed how well students felt that their previous research experience during and after college (if applicable) had prepared them for doctoral education, from 1 (not well at all) to 5 (extremely well). Research suggests that feeling unprepared for doctoral education is a contributor to attrition [36].
Has master's. One dichotomous (yes/no) item assessed whether participants had received a terminal master's degree prior to starting their doctoral program.
English proficiency. Students who reported being non-native in English were asked how well they spoke English from 1 (Little or no English: No proficiency) to 6 (Complete fluency).
Self-evaluations and perceived fit. Neuroticism, self-esteem, and self-efficacy are theorized to share the same underlying construct, which can be called core self-evaluations [37]. We were interested in these measures as well as academic and social concerns, grit, psychological need satisfaction, academic belonging, and belonging uncertainty as other forms of selfevaluation and perceived fit. More positive self-evaluations are linked to greater task persistence and performance [38,39].
Neuroticism. Three neuroticism items were taken from John et al.'s [40] measure of Big Five personality (e.g., "I am someone who worries a lot") and rated from 1 (strongly disagree) to 7 (strongly agree) (α = .76). Higher levels of neuroticism predict worse mental and physical health outcomes [41].
Self-esteem. A well-established single-item scale assessed self-esteem, "I have high selfesteem" [42]. Responses ranged from 1 (not at all true) to 5 (very true).
Graduate school self-efficacy. Three items adapted from Shryock and Froyd's [43] 8-item engineering self-efficacy scale assessed graduate school self-efficacy (e.g., "I expect to do well in graduate school"). Students responded from 1 (not at all true of me) to 4 (very true of me) (α = .79). Higher self-efficacy generally predicts better college performance [13].
Academic and social concerns. Four items, adapted from Cohen and Garcia [44] to reflect a graduate school context, tap academic and social concerns, or worries, about being negatively evaluated (e.g., "I worry that people in my graduate program will think I'm dumb if I do badly"). Participants responded from 1 (strongly disagree) to 7 (strongly agree) (α = .77). High levels of worry, measured in various ways, can worsen performance [45,46].
Grit. Eight items adapted from Duckworth and Quinn [47] assessed grit (e.g., "I finish whatever I begin") on a scale from 1 (not like me at all) to 5 (very much like me) (α = .77). Higher levels of grit have predicted higher grades, a greater sense of belonging, and more college satisfaction among undergraduates [11].
Psychological need satisfaction. This scale includes three subscales based on self-determination theory: the need for competence (e.g., "I successfully complete difficult tasks and projects"), relatedness (e.g., "I feel close and connected with other people who are important to me"), and autonomy (e.g., "I am free to do things my own way"). Students responded from 1 (not at all true) to 7 (extremely true). The 18 items, 6 for each subscale, were adapted from Sheldon and Hilpert [48] to measure current, rather than past, psychological experiences. To limit survey length, we removed two items from each subscale for the second cohort, so we averaged only the 12 items completed in both cohorts (α = .78). Greater need satisfaction, as measured by similar scales, predicts greater academic engagement [49].
Academic belonging. We adapted the nine items from the Social and Academic Fit Scale [24] used by Cook et al. [21], to reflect a graduate context (α = .81). Five items assessed social belonging (e.g., "People in my program accept me") and four items assessed potential to succeed (e.g., "I know what I need to do to succeed in grad school"). Students responded from 1 (strongly disagree) to 7 (strongly agree). Greater sense of belonging, measured in various ways, is linked to better academic outcomes [14].
Belonging uncertainty. Three items adapted from Walton and Cohen [24] captured belonging uncertainty from 1 (strongly disagree) to 7 (strongly agree): "Sometimes I feel like I belong in grad school and sometimes I feel like I don't belong," "When something good happens, I feel like I really belong in grad school," and "When something bad happens, I feel like maybe I don't belong in grad school." Students answered based on their experiences in graduate school so far. Reliability was somewhat low (α = .61), but items 1 and 3 were adequately correlated (r = .62), so we present results for the average of these in the main text and for the individual items in the S6 Appendix. Greater uncertainty about belonging undermines academic motivation and performance for stigmatized group members [24].
Academic identity and graduate school attitudes. This category includes interest, academic identification, confidence in the decision to pursue a Ph.D., strength of motivation to finish the Ph.D., academic self-control, impostor syndrome, and preference for an academic career. Academic identity and graduate school attitudes can predict academic persistence [50].
Interest. We assessed interest in research (e.g., "I am interested in my research topic") and field (e.g., "I am interested in learning more about my field of study") with three of four items from Choe et al. [51], rephrased to apply across fields. Participants responded from 1 (strongly disagree) to 7 (strongly agree) (α = .84). Greater interest, measured similarly, predicts retention in undergraduate STEM fields [52].
Researcher identification. We assessed identification as a researcher using three items adapted from Sellers and colleagues' centrality subscale [53] and two items from Choe et al. [51]. Participants responded from 1 (strongly disagree) to 7 (strongly agree) (α = .80). Higher levels of identification with one's studies, measured in various ways, are associated with greater academic persistence at the undergraduate level [54].
Decision confidence. Students responded to two author-generated items on how often they felt they made the right choice in pursuing a Ph.D. from 1 (never) to 4 (always) and if they ever doubted this decision from 1 (I never doubt my decision) to 4 (I frequently doubt my decision). We reverse-coded the latter item and averaged the two given their adequate correlation (r = .65).
Strength of motivation. Four items, taken from a 16-item scale [55] and reworded for a graduate school context, captured strength of motivation to continue the Ph.D. (e.g., "Even if I could hardly maintain my social life, I would still continue graduate school"). Students responded from 1 (strongly disagree) to 7 (strongly agree). Reliability was low (α = .55), but analysis of individual items yields the same pattern (see S6 Appendix).
Academic self-control. Academic self-control was measured with two of four items from Yeager et al. [56] (e.g., "I pay attention and resist distraction in my work"), rephrased for a graduate student population. Participants responded from 1 (not at all like me) to 5 (very much like me). We averaged the items because they were adequately correlated (r = .60).
Impostor syndrome. We selected 5 of the 20 items from the Clance Impostor Phenomenon Scale [57] and added a face-valid, author-generated item ("Sometimes I feel like a fraud"). Participants responded from 1 (not at all like me) to 5 (very much like me) (α = .79). Impostor syndrome is common and does not preclude achievement, but it does predict worse psychological well-being, including higher burnout and anxiety [58].
Academic career preference. Participants indicated the strength of their preference for an academic job after graduating on a sliding scale, anchored at -5 (strongly prefer non-academic), 0 (equal preference), and 5 (strongly prefer academic).
Interpersonal relations. We assessed characteristics related to students' interpersonal lives, given that social support and integration predict academic persistence at the undergraduate level and in doctoral programs [59,60].
Perceived social support. We measured perceived social support using a single item [61]. Participants indicated how true it was that "There are people I can count on to support me" from 1 (not at all true) to 5 (extremely true).
Similarity to colleagues. A single author-generated item captured perceived similarity to colleagues, "How similar or different to other people in your department do you see yourself?" Responses ranged from 1 (Very different) to 6 (Very similar).
Mental health. Our measure of mental health was adapted from Kessler and colleagues' scale of psychological distress [62], which assessed how often participants felt nervous, hopeless, restless or fidgety, so depressed that nothing could cheer them up, that everything was an effort, and worthless during the past 30 days, from 0 (none of the time) to 4 (all of the time). We summed items (α = .85) to create a scale score. Higher scores on the Kessler scale predict lower academic achievement [63].

Social identity threat
We include measures of stereotype threat and identity interference to assess SIT.
Stereotype threat. Adapted from Cohen and Garcia [44], this scale has six items that assess stereotype threat, which we modified for a graduate school context (e.g., "I worry that people in my graduate program will judge me based on what they think of my racial group [people of my gender]"). Students responded from 1 (strongly disagree) to 7 (strongly agree). Racial (α = .91) and gender (α = .91) stereotype threat were separated. Greater stereotype threat predicts worse academic performance for women and URM students [17].
Identity interference. We adapted four items from Settles' [64] 17-item scale of identity interference, which is when one identity conflicts with another, specifically gender identity and science/researcher identity in this context (e.g., "I feel that other researchers do not take me seriously because of my gender"). Students responded from 1 (strongly disagree) to 7 (strongly agree). Higher levels of identity interference on this scale have predicted lower selfesteem and lower perceived science performance [64,65]. We removed one item ("I feel that because of my gender, it is easier for me to fit the definition of a researcher") to increase consistency from α = .68 to α = .82 (see S6 Appendix).

Analytic strategy
For the delineation of classes, we considered several indicator variables and used an iterative process to select a useful set for developing an interpretable and parsimonious model. By design, our goal was to use theory to guide the selection of indicator variables (i.e., those related to psychological threat) and then to empirically derive the best set based on quantitative fit and interpretability of classes. We used LCA instead of latent profile analysis (LPA), a similar technique appropriate when indicators are continuous, because LPA has strict assumptions (e.g., that indicators be normally distributed; see [28]) that create model fit and stability issues in the commonly encountered situation where assumptions are violated. Indeed, we encountered these exact issues, suggesting LPA was not suitable for our data. Thus, we proceeded with LCA, and to do so, we trichotomized the eight indicators that were ultimately selected (see S7 Appendix). Where possible, we created categories based on meaningful scale response options. For instance, for distress we used the cutoff for clinical concern (13) in our trichotomization. For variables that were positively skewed (i.e., interest, researcher identification; see Fig 1), we trichotomized so that each group had a sufficient sample size.
We conducted LCAs with different numbers and combinations of indicator variables related to psychological threat and indicated by past literature as relevant to academic outcomes (see S8 Appendix for considered indicators). We sought indicators that distinguished between classes (i.e., threat profiles) and for a model that was stable and theoretically interpretable. For instance, we eliminated growth mindset and race-based stereotype threat as potential indicators because although theoretically meaningful and relevant, they did not distinguish between classes well. Ultimately, our final model had eight indicators and four classes. Two of our eight indicator variables, need fulfillment and gender threat, were composite variables suggested by high intercorrelations among predictors that led us to conduct a principal components analysis to try and reduce the number of individual indicator variables (see S9 Appendix for details). Need fulfillment was comprised of academic belonging, graduate school self-efficacy, and psychological need satisfaction. Gender threat was comprised of genderbased stereotype threat and identity interference. Composite variables reduce the number of indicators, which helps facilitate model fit and avoid redundant indicator variables.
To better understand the best fitting LCA model, our analysis strategy next turned to identifying who is in each class, that is, how the classes differed by demographic variables often used as proxies for psychologically threatened groups (e.g., gender). Our goal was to test whether our interpretations of the risk level of the classes mapped onto these demographic variables in expected ways. We chose the following demographic variables to align with this goal: gender, sexual orientation, race/ethnicity, SES, first-generation student status, and international student status. We expected women, sexual minorities, and low-SES/first-generation students to be overrepresented in higher threat, particularly higher SIT, classes (and underrepresented in lower threat classes) given the potential of these groups to face psychological threats beyond what dominant groups encounter. We included race/ethnicity with a similar rationale, but the relatively small and racially heterogeneous URM sample inhibits interpretability. We included international student status for exploratory purposes.
To analyze class differences in demographic variables, we used the BCH procedure within the LCA framework, which is recommended for examining how LCA-derived classes predict outcomes [66]. The BCH procedure uses linear and logistic regression to predict outcomes from class membership accounting for measurement-error weighting associated with assigning people to their most likely class. The procedure can be used cross-sectionally and does not assume an antecedent-consequent structure. The most common alternative approach, classifyanalyze, entails assigning individuals to classes without measurement-error weighting, which is contraindicated [67].
Because the BCH technique requires categorical correlates of the classes to be binary, we recoded and converted demographic variables where necessary. We collapsed gender into male (0) and not male (1), given that only 11 students identified exclusively as genderqueer (see Table 1) and the potential for both female and genderqueer identities to suffer heightened psychological threat [68]. Results did not meaningfully differ when only male and female identified students were included. We collapsed the categories for sexual orientation into heterosexual (0) and queer (1), used here to denote all non-heterosexual identities. We compared continuing generation (0) to first-generation students (1). We dichotomized SES, such that students who saw themselves as average or above average, that is, at or above the scale midpoint (� 5; 0), were compared to those who saw themselves as below average (< 5; 1). We coded race/ethnicity such that non-URM students (White, Asian, or White/Asian and international; 0) were compared to URM students (at least partially Native American, Hispanic, or Black, and domestic; 1). We coded this way because Native American, Hispanic, and Black students are underrepresented among people with STEM doctoral degrees whereas White and Asian students are not [1]. We did not group international students with URM regardless of race/ethnicity given the unique racial context of the United States, which may not apply to international students. Most international students were Asian, which would preclude them from categorization as URM students regardless. Of course, international students may experience graduate school differently than domestic students, which we tested with a variable comparing domestic students (0) to international students (1).
In addition to examining demographics, we examined campus differences by class to investigate the potential for the local context to influence the proportions of students falling into each class. For instance, perhaps students in a rural setting (e.g., Penn State) worry about finding community and are overrepresented in higher threat classes. To investigate campus, we ran three analyses using the BCH procedure, each analysis with a different binary-coded campus variable (e.g., Penn State and not Penn State, with the latter as the reference group).
The final step in our analysis strategy was to characterize the classes more fully by testing their association with theory-relevant constructs. This step is important because not every construct relevant to academic outcomes can be used as an indicator. For instance, although we did not use race-based stereotype threat as an indicator given it contributed relatively less to a clear class makeup than other variables, the classes may still vary in race-based stereotype threat, which can have implications for overall risk level. Using the recommended BCH technique, we tested for class differences in the continuous indicator variables (i.e., pre-trichotomization), including the components of the two composites, as well as other relevant constructs.

Model selection
We conducted latent class analyses in Mplus version 8.4 [69] after preparing data files in RStudio version 1.3.1093 [70] using the MplusAutomation package [71]. We evaluated model identification using 1,000 sets of random initial stage starting values and 500 final stage starts. We specified models that varied in the number of indicators, and within those model sets we varied the number of classes and then empirically evaluated relative fit using (1) the Bayesian information criterion (BIC; [72]), (2) the sample-size adjusted BIC (aBIC; [73]), (3) the Akaike information criterion (AIC; [74]), (4) the bootstrapped likelihood ratio test (BLRT; see [75]), and (5) the Vuong-Lo-Mendell-Rubin adjusted likelihood ratio test (VLMR-LRT; see [75]). Lower values for BIC, aBIC, and AIC indicate relatively better balance between parsimony and model fit. We emphasized the BIC and BLRT in particular given evidence showing their unique strength in identifying the ideal number of classes [75]. We also considered absolute model fit (e.g., the G 2 likelihood-ratio chi-square statistic; [76]). In addition to evaluating empirical strength, we emphasized theoretical interpretability in model selection [76].
The iterative process of model specification and evaluation ultimately resulted in one interpretable, robust model with eight indicators and four classes. Table 3 presents fit indices for models ranging from 1 to 8 classes using the eight indicators that ultimately proved most important to model specification. Table 3 also presents entropy values for these models. Entropy is a measure of class separation for which higher values indicate greater separability and higher classification utility. Entropy was not used as a model selection statistic, as its utility is in identifying problems with overextraction rather than distinguishing well between the appropriateness of models with different numbers of classes [77]. Table 4 presents average posterior probabilities from the selected 4-class model, which are the average of each individual's probability of membership in each class. Higher average posterior probabilities reflect greater certainty that members of a class are assigned correctly.
The two kinds of parameters estimated in LCAs are latent class prevalences (i.e., the proportion of the sample in each class) and item response probabilities, which represent the probability of answering a certain way given membership in each class-these parameters thus reveal the size and core traits of each class (see Fig 2).

Model description
In this section, we present the description of each of the four classes in the selected model in order from lowest to highest threat based on our interpretations of the classes.
Lowest Threat class. Class 1, about 21% of the sample, has the least psychological threat. Most Class 1 students are high in need fulfillment, which reflects academic belonging, graduate school self-efficacy, and psychological need satisfaction. The vast majority also report low levels of gender threat, psychological distress, and academic and social concerns, and high levels of grit and interest. Compared to the other classes, Class 1 students report the least impostor syndrome, with over half (59%) having low impostor syndrome. Only 21% reported, on average, a response of less than "somewhat agree" to items assessing researcher identification. Hence, most of these students feel at least moderately identified with their work, and in relative terms, they feel more strongly identified than students in Classes 2 or 4. Overall, Class 1

PLOS ONE
students appear psychologically prepared for and engaged with graduate school with few concerns. We label Class 1 as Lowest Threat. Nonchalant class. Class 2, about 29% of the sample, is distinguished by the vast majority of its students scoring (1) in the middle on need fulfillment, (2) low on gender threat, and (3) low on distress. These students face minimal SIT and distress; however, their levels of academic and social concerns vary, with nearly half (45%) not feeling strongly either way about whether they are worried about others perceiving them negatively. Similarly, over half (54%) have moderate levels of impostor syndrome. Class 2 students either report medium or high grit and interest and medium or low researcher identification. Class 2 students vary, but overall do not seem to be particularly high threat based on their response patterns; however, they are not as confident as the Lowest Threat class given their responses tend more towards the middle. Given this pattern and the general lack of strong positive or negative feelings, we label Class 2 as Nonchalant.
Engaged/Worried class. Class 3, the largest class at 30% of the sample, is distinguished by having most of its members score high on (1) academic and social concerns and (2) impostor syndrome, but (3) also interest. There is more variation on the need fulfillment composite than in the other classes, although most (55%) members fall in the middle. Although about half of the students in Class 3 report low gender threat, there is more gender threat in this class than in the others, with 30% reporting relatively high levels. Class 3 students tended to be low or medium on distress and medium or high on grit. They are most like the Lowest Threat class on researcher identification. Overall, students in Class 3 appear to face some psychological threat, including the highest levels of SIT, but also seem highly engaged in their studies. We label Class 3 as Engaged/Worried. Highest Threat class. Class 4, the smallest class at about 20% of the sample, consists of students who, for the most part, are low on need fulfillment, high in academic and social concerns, and low in researcher identification. Although the majority (57%) report minimal gender threat, more students in this class than those in the Nonchalant or Lowest Threat classes experience moderate to high levels of gender threat. Class 4 students vary in their psychological distress, with almost half reporting a moderate amount and 19% reporting clinically concerning levels, which is more than any of the other classes. The majority report high levels of impostor syndrome, with most others reporting medium levels. A higher proportion of Class 4 students than in the other classes scores low on grit, and fewer score high on grit, although most in Class 4 report a medium level. The vast majority of students in Class 4 score low or medium on interest, which is also lower than the other classes. Relatively speaking, students in this class have the most psychological threat. We label Class 4 as Highest Threat.
Who is in each class?. Table 5 presents class differences on demographics. The BCH procedure provides a chi-square test of the difference between classes on each variable, along with estimated means by class. Because of how the demographic variables are coded, these estimated means represent the proportion of students in a particular demographic category. The BCH procedure also provides significance tests for pairwise comparisons between classes (e.g., does the Engaged/Worried class have a significantly higher proportion of women than the Lowest Threat class?). Rate" column refers to the number/proportion of the entire sample (i.e., across classes) that belongs to the indicated group (e.g., female/genderqueer).
The classes differed by gender, sexual orientation, SES, and international status, but not by first-generation status, race/ethnicity, or campus (see Table 5). We expected relative overrepresentation of potential at-risk social categories (e.g., women) in the higher risk classes, particularly any associated with SIT, and relative underrepresentation in lower risk classes. We did not find this pattern reflected in the Highest Threat class, suggesting the need to look beyond group membership when assessing at-risk students, but we did find this in the Engaged/Worried class (i.e., the class highest in SIT). The Engaged/Worried class has overrepresentation of female/genderqueer and queer identities, suggesting that this class is characterized by relatively high SIT, which coincides with the initial description (i.e., they had relatively high gender threat). The Lowest Threat class is underrepresented in terms of low-SES students and somewhat underrepresented in terms of international students; we expected this class to have fewer students who are in at-risk social categories, so the SES finding is unsurprising. The Nonchalant class, and to a lesser extent the Highest Threat class, have slight overrepresentation of international students. We did not have expectations for international students, but their overrepresentation in the Highest Threat class may indicate that some international students are at heightened risk. Overall, results provide some evidence for our predictions on demographic group representation.

Descriptive information about risk by class
To further understand the implications of class membership for student risk levels, we examined how class membership was related to academic and attrition-relevant proximal characteristics. For these analyses, we used the same approach as we did for the demographic variables, the automated BCH procedure that produces Chi-square results. Class membership was associated with all examined characteristics. Table 6 shows an overview of expected risk level by class and Table 7 shows results. We present the means by class visually in the S10 Appendix.

Lowest Threat class
Overall, at matriculation, Lowest Threat students fared better than the other classes.
Lowest Threat students were very academically prepared. For instance, compared to others, they found their prior research experience most helpful in preparing them for graduate school. These students also reported the most positive attitudes towards graduate school and academia; they felt the surest about pursuing a Ph.D. and were highly motivated to finish. They similarly had the most positive perceptions of their interpersonal relations. Aligned with their initial description as self-assured, Lowest Threat students also had the most positive self-evaluations and perceptions of fit, with the least neuroticism and uncertainty about belonging, and the highest self-efficacy, self-esteem, and belonging. These students also reported the least distress and race-based stereotype threat and low levels of gender-based SIT. Taken together, results support the notion that Lowest Threat students are at low risk of adverse outcomes like attrition and may foreshadow relatively positive academic outcomes.
Nonchalant class. Students in the Nonchalant class tend to fall in the middle in terms of threat compared to the other classes but are more like the Lowest Threat class than the other two classes. Nonchalant students were relatively well-prepared; for instance, this class had the highest rate of students entering with master's degrees. However, among non-native speakers, English proficiency was lower than Lowest Threat and Engaged/Worried students, which could cause challenges (e.g., with social integration or writing). Nonchalant students had mixed attitudes towards graduate school and academia; they felt less sure about pursuing a Ph.D. than Lowest Threat students but more sure than Highest Threat students, and they were less motivated to finish than Lowest Threat and Engaged/Worried students. Nonchalant students also had mixed characteristics regarding interpersonal relations. For instance, they perceived less social support than Lowest Threat and Engaged/Worried students but felt as similar to their colleagues as Lowest Threat students. Nonchalant students tended to be in between the Lowest Threat and Highest Threat classes on self-evaluations and perceived fit; on all self-evaluations, Nonchalant students fared better than or similarly to Engaged/Worried students. These students also reported more distress and stereotype threat than Lowest Threat students but less than the other classes.
Given these results, we would expect Nonchalant students to be at lower risk of adverse outcomes than Engaged/Worried and Highest Threat students but not to be entirely risk-free.
Engaged/Worried class. Overall, results support the assessment of the Engaged/Worried class as the class facing the second-highest levels of psychological threat.
Engaged/Worried students were fairly well-prepared academically. Students in this class had a similar amount of undergraduate research experience as Lowest Threat students but felt significantly less prepared by it. This discrepancy may indicate that Engaged/Worried students feel less confident than their objective qualifications warrant. This finding may be a sign of SIT, particularly considering that the majority of this class is female, and is consistent with women (vs men) tending to be underconfident in STEM fields [39]. Indeed, Engaged/Worried students reported the most gender-based SIT, although the average was below the scale midpoint. Engaged/Worried students also had mixed attitudes towards graduate school and academia and mixed perceptions of interpersonal relations that further suggest their high engagement with graduate school alongside self-doubts. For instance, they reported higher motivation to finish than Nonchalant and Highest Threat students and perceived similarly high levels of social support as Lowest Threat students but had the most impostor syndrome (above the scale midpoint) and saw themselves as less like their colleagues than Lowest Threat and Nonchalant students (albeit above the midpoint). Engaged/Worried students also had somewhat poor self-evaluations, perceptions of fit, and mental health; they reported the most academic and social concerns (above the scale midpoint) and fared worse than Nonchalant and Lowest Threat students on self-esteem (below the midpoint) and distress.
Overall, Engaged/Worried students seem to be struggling more with threat than Nonchalant students, although it is possible that their high engagement is protective against other sources of threat, or that their high neuroticism and other forms of worry are beneficial (e.g., they motivate rather than immobilize). However, the relatively high threat levels of these students at the beginning of graduate school may predict suboptimal academic or well-being outcomes.
Highest Threat class. Results suggest that Highest Threat students are struggling at the start of graduate school.
Highest Threat students were pursuing their Ph.D.'s in fields that are on average~36% female, despite the class being evenly split by gender (51% male). Thus, female students in this class may tend to be underrepresented in their fields of study, which can create SIT. Indeed, Highest Threat students reported relatively high levels of gender-based SIT; they also reported the highest levels of race-based stereotype threat (albeit below the scale midpoint). Additionally, non-native English speakers in this class reported the least English proficiency, which, as noted previously, could pose challenges. Highest Threat students were also least prepared, with the least postgraduate prior research experience and the lowest sense of preparedness from prior experience. They also reported the most negative attitudes towards graduate school/academia and the most negative interpersonal relations. For instance, they felt the least interest in their work, the most doubt about pursuing a Ph.D., the least supported, and the least like their colleagues. These students also evaluated themselves the most negatively and as having the worst fit, with the lowest self-esteem, self-efficacy, and sense of belonging. Moreover, although their average did not reach clinical thresholds, Highest Threat students reported more distress than others. Notably, although some class differences are relative, others may be concerning on a more absolute level: Highest Threat students scored above the scale midpoints on impostor syndrome, neuroticism, academic and social concerns, and belonging uncertainty, and they scored below the midpoints on feelings of preparedness, academic self-control, similarity to colleagues, self-efficacy, and self-esteem.

Discussion
Using latent class analysis (LCA), we identified four classes of incoming Ph.D. students that varied in psychological threat experiences. Even among this elite group of students-who represent a critical population for the advancement of knowledge-there is a meaningful proportion who begin graduate school in a seemingly tenuous psychological state. We highlight the possibility of identifying such students early and tailoring interventions to prevent attrition.
We found that two classes reported fairly positive psychological states at matriculation. Students in the Lowest Threat class consistently reported the most auspicious characteristics and given their all-around low-threat psychological state, seem to be at low risk of adverse outcomes. Students in the Nonchalant class were like those in the Lowest Threat class but had less propitious characteristics. Nonchalant students were neither greatly worried about nor greatly engaged in graduate school, although they had generally more positive than negative characteristics. For instance, they were well-prepared for graduate school and had positive self-evaluations but had mixed attitudes towards graduate school. They also reported having relatively low social support, perhaps linked to the overrepresentation of international students in this class. International students may face more difficulties integrating socially than non-international students, given cultural differences, geographic distance from family, and for some, language barriers [78]. Overall, we expect Nonchalant students to be at low risk of attrition from psychological threat, but it will be important to follow their trajectory to identify which forms of psychological threat are more or less impactful for Ph.D. student persistence.
The other two classes tended to report more negative psychological states. The Engaged/ Worried class was characterized by reports of high interest in and engagement with graduate school but also high threat, with concerns about fitting in and being judged. Although this class demonstrated high preparation, their relatively high negative affect, uncertainty, SIT, and distress may portend heightened attrition risk. In general, these students evaluated themselves rather negatively, which likely does not reflect their competencies given their prior research experience and acceptance into their programs. We expected female, queer, first-generation, and URM students to face more SIT than other students and for them to be overrepresented in higher threat classes; given these expectations and our similar pre-registered hypotheses, we found it unsurprising that the Engaged/Worried class had the most women and queer students and reported the most SIT. Although not the highest threat class, the Engaged/Worried class appears to be relatively high risk based on students' tendency to report inauspicious characteristics and psychological attributes. It is unclear whether these students' high engagement would offset other threats. One possibility is that these students will thrive in supportive environments but struggle in unsupportive environments without intervention. Thus, this class could be an important indicator of risk used to target intervention efforts.
The Highest Threat class had the least auspicious scores on most measures. At matriculation, Highest Threat students were least prepared, reported negative attitudes and self-evaluations, worried about their relationships and fit, felt highly distressed, and experienced relatively high SIT. They tended to be in fields with fewer women despite the class being split evenly by gender, which may contribute to their relatively high gender-based SIT. They had the most negative characteristics in most domains, and differences were not just relative; their average scores on many threat-relevant variables fell on the unfavorable side of the scale midpoint. We thus expect these students are at the highest risk of attrition. Accordingly, we suggest these students might have the most to gain from intervention and might particularly benefit from one that targets the breadth of their threats. For instance, they may benefit from an approach that targets preparedness and belonging, like a summer program that has built confidence and community for incoming undergraduates [79] and from social psychological interventions meant to bolster psychological resources.
Notably, the Highest Threat class was not overrepresented in terms of high-risk social categories. This suggests the importance of looking beyond group membership when designing ways to assess threat and reduce attrition for STEM Ph.D. students; indeed, the approach we took with LCA, instead of using social categories as threat proxies, seems to provide much nuance.
Moreover, our findings shed light on how specific psychological theories might provide helpful frameworks for investigating the experiences of doctoral students. For instance, the characterization of the Engaged/Worried class highlights that SIT theory is applicable among doctoral students and underscores the importance of investigating disparities by demographic group. At the same time, however, our results regarding the Highest Threat class demonstrate that students might be at-risk even if they do not face SIT specifically. Moreover, we find that our results are interesting to consider in the context of the Zone Model of Threat theory [30], which draws from the expectancy-value theory of achievement motivation [7] and suggests two key dimensions by which students can be identified for the purpose of tailoring interventions: (1) how much students value a domain and (2) what students expect about their ability to succeed in that domain. Binning and Browman [30] suggest that students who highly value a domain but feel worried about succeeding in it (i.e., one of the four threat zones) would benefit most from interventions that aim to better align student expectancies with motivations. We suggest that the Engaged/Worried class aligns with this threat zone, as students in this class were highly engaged with their studies but simultaneously worried about their ability to succeed. As such, our future work with SAGES data examining how the Engaged/Worried and other classes respond to expectancy-enhancing social psychological interventions will be an interesting test of the Zone Model of Threat theory.
Overall, our findings highlight the importance of attending to Ph.D. students' psychological experiences. Ph.D. students are important for universities and the future of research, as they become responsible for scientific discovery, theoretical innovation, and teaching the next generation. They are among the top students in the world, and their aptitude is thoroughly vetted before admission. Thus, it is distressing if these highly qualified scientific leaders leave, not due to inability, but due to psychological barriers. We found that two classes, half of the sample, reported relatively high psychological threat in many domains relevant to academic persistence, with one of these classes faring clearly worse than the rest at the start of their studies. In following up on these students, we will use the baseline threat profiles identified here to examine psychological changes throughout doctoral education based on differing starting points by class, as well as how different types of students respond to two distinct social-psychological interventions. Ultimately, it will be important to examine the utility of these classes for identifying which students may be more likely to attrit in the absence of intervention.
Limitations include generalizability constraints, in part due to the features of LCA, which can be difficult to generalize, as class construction is sample-specific. Although we conducted robustness analyses by cohort and campus that support generalizability (see S11 Appendix), we cannot conclude that the same classes would emerge in another sample or setting. We also had to trichotomize some indicator variables in sample-specific ways (i.e., using observed distributions), which could exacerbate overfitting. Similarly, our sample is specifically comprised of doctoral students at three relatively elite universities, and even if our findings are generalizable to similar universities, it is possible that our findings may not generalize to all universities. However, there is evidence to suggest that more prestigious universities produce disproportionately high numbers of faculty [80] and thus, regardless of how well our findings can speak to all types of university, our research may have important implications for the future of knowledge production and teaching.
Future research would benefit from finding a simpler way of identifying at-risk students that may be more easily implemented, even if it may lose some of the nuance of LCA. One option would be to reverse-engineer the classes by finding a way to reduce the number of indicators while maintaining sufficient complexity to recapitulate relevant outcomes and identify at-risk students. As we examine how these classes fare over time, we can better identify which features of each class are most critical to measure. Despite limitations, our research provides a framework for identifying the types of Ph.D. students who may be psychologically threatened at matriculation.

Conclusion
Using LCA among a large sample of incoming Ph.D. students, we identified four psychological threat profiles that may portend different risk levels of negative psychological and academic outcomes. Although future work is needed to determine long-term implications, it is notable that a substantial number of Ph.D. students face relatively high psychological threat at matriculation. These results are important for educators, administrators, and policymakers to help determine the support some students may need to ensure their-and in turn society'ssuccess.
Supporting information S1 Appendix. Unrestricted 4-class LCA using all available data.